AITopics | weighted majority voting

Learning-from-crowds aims to design proper aggregation strategies to infer the unknown true labels from the noisy labels provided by ordinary web workers.

artificial intelligence, machine learning, majority voting, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

A Latent Source Model for Nonparametric Time Series Classification

Neural Information Processing SystemsSep-30-2025, 12:32:36 GMT

For classifying time series, a nearest-neighbor approach is widely used in practice with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to, e.g., topics become trends on Twitter only in a few distinct manners whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a weighted majority voting" classification rule that can be approximated by a nearest-neighbor classifier. We establish nonasymptotic performance guarantees of both weighted majority voting and nearest-neighbor classification under our model accounting for how much of the time series we observe and the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearest-neighbor classification while observing less of the time series. We then use weighted majority to forecast which news topics on Twitter become trends, where we are able to detect such "trending topics" in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%."

nonparametric time sery classification, time sery, weighted majority voting, (5 more...)

Neural Information Processing Systems

Industry: Information Technology > Services (0.84)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Retrieval-Augmented Generation with Estimation of Source Reliability

Hwang, Jeongyeon, Park, Junyoung, Park, Hyejin, Park, Sangdon, Ok, Jungseul

arXiv.org Artificial IntelligenceOct-30-2024

Retrieval-augmented generation (RAG) addresses key limitations of large language models (LLMs), such as hallucinations and outdated knowledge, by incorporating external databases. These databases typically consult multiple sources to encompass up-to-date and various information. However, standard RAG methods often overlook the heterogeneous source reliability in the multi-source database and retrieve documents solely based on relevance, making them prone to propagating misinformation. To address this, we propose Reliability-Aware RAG (RA-RAG) which estimates the reliability of multiple sources and incorporates this information into both retrieval and aggregation processes. Specifically, it iteratively estimates source reliability and true answers for a set of queries with no labelling. Then, it selectively retrieves relevant documents from a few of reliable sources and aggregates them using weighted majority voting, where the selective retrieval ensures scalability while not compromising the performance. We also introduce a benchmark designed to reflect real-world scenarios with heterogeneous source reliability and demonstrate the effectiveness of RA-RAG compared to a set of baselines.

dataset, information, reliability, (15 more...)

arXiv.org Artificial Intelligence

2410.22954

Country:

Atlantic Ocean (0.04)
Southern Ocean (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.67)
Media > News (0.50)
Law > Statutes (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CMU-MATH team's innovative approach secures 2nd place at the AIMO prize

AIHubAug-19-2024, 09:02:46 GMT

The Artificial Intelligence Mathematical Olympiad (AIMO) Prize, initiated by XTX Markets, is a pioneering competition designed to revolutionize AI's role in mathematical problem-solving. It pushes the boundaries of AI by solving complex mathematical problems akin to those in the International Mathematical Olympiad (IMO). The advisory committee of AIMO includes Timothy Gowers and Terence Tao, both winners of the Fields Medal. Attracting attention from world-class mathematicians as well as machine learning researchers, the AIMO sets a new benchmark for excellence in the field. AIMO has introduced a series of progress prizes.

majority voting, policy model, reward model, (13 more...)

AIHub

Country: North America > United States (0.05)

Genre:

Research Report > Promising Solution (0.40)
Overview > Innovation (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.33)

Add feedback

SignSGD with Federated Voting

Park, Chanho, Poor, H. Vincent, Lee, Namyoon

arXiv.org Artificial IntelligenceMar-24-2024

Distributed learning is commonly used for accelerating model training by harnessing the computational capabilities of multiple-edge devices. However, in practical applications, the communication delay emerges as a bottleneck due to the substantial information exchange required between workers and a central parameter server. SignSGD with majority voting (signSGD-MV) is an effective distributed learning algorithm that can significantly reduce communication costs by one-bit quantization. However, due to heterogeneous computational capabilities, it fails to converge when the mini-batch sizes differ among workers. To overcome this, we propose a novel signSGD optimizer with \textit{federated voting} (signSGD-FV). The idea of federated voting is to exploit learnable weights to perform weighted majority voting. The server learns the weights assigned to the edge devices in an online fashion based on their computational capabilities. Subsequently, these weights are employed to decode the signs of the aggregated local gradients in such a way to minimize the sign decoding error probability. We provide a unified convergence rate analysis framework applicable to scenarios where the estimated weights are known to the parameter server either perfectly or imperfectly. We demonstrate that the proposed signSGD-FV algorithm has a theoretical convergence guarantee even when edge devices use heterogeneous mini-batch sizes. Experimental results show that signSGD-FV outperforms signSGD-MV, exhibiting a faster convergence rate, especially in heterogeneous mini-batch sizes.

algorithm, error probability, probability, (14 more...)

arXiv.org Artificial Intelligence

2403.16372

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

a8849b052492b5106526b2331e526138-Reviews.html

Neural Information Processing SystemsMar-13-2024, 19:22:52 GMT

The authors examine theoretically and empirically properties of the k 1-nearest neighbor classification (NN) and a trivial variant of the kernel density classification, called "weighted majority voting", in a time-series binary classification problem. For theoretical analysis, a simplified generative model for time-series is introduced. Under this model, they provide non-asymptotic performance guarantees in terms of how large of a training dataset and how much of the time-series length to be classified. Although the theoretical analysis is technically sound, the empirical work will be incomplete. In the experiments, the weighted majority voting and the NN are just tested while data analysts would at least use the k-NN, rather than the NN, with a tuned'k' by cross-validation or so.

classification, theoretical analysis, weighted majority voting, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.85)

Add feedback

A Latent Source Model for Nonparametric Time Series Classification George H. Chen Stanislav Nikolov Devavrat Shah MIT

Neural Information Processing SystemsMar-13-2024, 19:22:52 GMT

For classifying time series, a nearest-neighbor approach is widely used in practice with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to, e.g., topics become trends on Twitter only in a few distinct manners whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a "weighted majority voting" classification rule that can be approximated by a nearest-neighbor classifier. We establish nonasymptotic performance guarantees of both weighted majority voting and nearest-neighbor classification under our model accounting for how much of the time series we observe and the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearest-neighbor classification while observing less of the time series. We then use weighted majority to forecast which news topics on Twitter become trends, where we are able to detect such "trending topics" in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%.

latent source, majority voting, time sery, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Max-Margin Majority Voting for Learning from Crowds

Neural Information Processing SystemsMar-13-2024, 03:59:50 GMT

Learning-from-crowds aims to design proper aggregation strategies to infer the unknown true labels from the noisy labels provided by ordinary web workers.

constraint, estimator, majority voting, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

A Latent Source Model for Nonparametric Time Series Classification

Neural Information Processing SystemsApr-6-2023, 11:57:25 GMT

For classifying time series, a nearest-neighbor approach is widely used in practice with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to, e.g., topics become trends on Twitter only in a few distinct manners whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a weighted majority voting" classification rule that can be approximated by a nearest-neighbor classifier. We establish nonasymptotic performance guarantees of both weighted majority voting and nearest-neighbor classification under our model accounting for how much of the time series we observe and the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearest-neighbor classification while observing less of the time series. We then use weighted majority to forecast which news topics on Twitter become trends, where we are able to detect such "trending topics" in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%."

nonparametric time sery classification, time sery, weighted majority voting, (5 more...)

Neural Information Processing Systems

Industry: Information Technology > Services (0.84)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Stability of Weighted Majority Voting under Estimated Weights

Bai, Shaojie, Wang, Dongxia, Muller, Tim, Cheng, Peng, Chen, Jiming

arXiv.org Artificial IntelligenceJul-13-2022

Weighted Majority Voting (WMV) is a well-known optimal decision rule for collective decision making, given the probability of sources to provide accurate information (trustworthiness). However, in reality, the trustworthiness is not a known quantity to the decision maker - they have to rely on an estimate called trust. A (machine learning) algorithm that computes trust is called unbiased when it has the property that it does not systematically overestimate or underestimate the trustworthiness. To formally analyse the uncertainty to the decision process, we introduce and analyse two important properties of such unbiased trust values: stability of correctness and stability of optimality. Stability of correctness means that the decision accuracy that the decision maker believes they achieved is equal to the actual accuracy. We prove stability of correctness holds. Stability of optimality means that the decisions made based on trust, are equally good as they would have been if they were based on trustworthiness. Stability of optimality does not hold. We analyse the difference between the two, and bounds thereon. We also present an overview of how sensitive decision correctness is to changes in trust and trustworthiness.

correctness, probability, trustworthiness, (12 more...)

arXiv.org Artificial Intelligence

2207.06118

Country: